Abstract. Differential privacy is a notion of privacy that has become
very popular in the database community. Roughly, the idea is that a
randomized query mechanism provides sufficient privacy protection if
the ratio between the probabilities that two different entries originate a
certain answer is bounded by $e^{\epsilon}$. In the fields of anonymity and
information flow there is a similar concern for controlling information leakage,
i.e. limiting the possibility of inferring the secret information from the
observables. In recent years, researchers have proposed to quantify the
leakage in terms of the information-theoretic notion of mutual information.
There are two main approaches that fall in this category: one based
on Shannon entropy, and one based on Renyi's min entropy. The latter
is connected with the so-called Bayes risk, which expresses the
probability of guessing the secret.

In this paper, we show how to model the query system in terms of an 
information-theoretic channel, and we compare the notion of differential 
privacy with that of mutual information. We show that the notion of 
differential privacy is strictly stronger, in the sense that it implies a
bound on the mutual information, but not vice versa.



1 Introduction 

The growth of information technology raises significant concerns about the
vulnerability of sensitive information. The possibility of collecting and storing
data in large amounts and the availability of powerful data processing techniques
open the way to the threat of inferring private and secret information, to such an
extent that fully justifies the users' worries.



1.1 Differential privacy 

The area of statistical databases has been, naturally, one of the first communities 
to consider the issues related to the protection of information. Already some 
decades ago, Dalenius [10] proposed a famous "ad omnia" privacy desideratum:
nothing about an individual should be learnable from the database that cannot 
be learned without access to the database. 

Dalenius' property, however, is too strong to be useful in practice: it has
been shown by Dwork [11] that no useful database can provide it. As a
replacement, Dwork has proposed the notion of differential privacy, which has had
an extraordinary impact in the community. Intuitively, this notion is based on the
idea that the presence or the absence of an item in the database should not
change in a significant way the probability of obtaining a certain answer for a
given query [11-14].

In order to explain the concept more precisely, let us consider the typical 
scenario: we have databases whose entries are values (possibly tuples) taken 
from a given universe. A database can be queried by users who have honest
purposes, but also by attackers trying to infer secret or private data. In order
to control the leakage of secret information, the curator uses some randomized
mechanism, which causes a certain lack of precision in the answers. Clearly,
there is a trade-off between the need to obtain answers that are as precise as
possible for legitimate use, and the need to introduce some fuzziness in order to
confuse the attacker.

Let $\mathcal{K}$ be the randomized function that provides the answers to the queries.
We say that $\mathcal{K}$ provides $\epsilon$-differential privacy if for all databases $D$ and $D'$, such
that one is a subset of the other and the larger contains a single additional entry,
and for all $S \subseteq range(\mathcal{K})$, the ratio between the probability that the result of
$\mathcal{K}(D)$ is in $S$, and the probability that the result of $\mathcal{K}(D')$ is in $S$, is at most $e^{\epsilon}$.

Dwork has also studied sufficient conditions for a randomized function $\mathcal{K}$ to
implement a mechanism satisfying $\epsilon$-differential privacy. It suffices to consider
a Laplacian distribution with variance depending on $\epsilon$, and mean equal to the
correct answer [13]. This technique is widely used in practice.

1.2 Quantitative information flow and anonymity 

The problem of preventing the leakage of secret information has been a pressing
concern also in the area of software systems, and has motivated a very active line
of research called secure information flow. As in the case of privacy, the initial
goal in this field was ambitious: to ensure non-interference, which means
complete lack of leakage. But, as for Dalenius' notion of privacy,
non-interference is too strong to be obtainable in practice, and the community
has started exploring weaker notions. Some of the most popular approaches are
the quantitative ones, based on information theory. See for instance [6-8, 15-17,
21].

Independently, the field of anonymity, which is concerned with the protection
of the identity of agents performing certain tasks, has evolved towards similar
approaches. In the case of anonymity it is even more important to consider a
quantitative formulation, because anonymity protocols typically use
randomization to obfuscate the link between the culprit (i.e. the agent which
performs the task) and the observable effects of the task. The first notion of
anonymity, due to Chaum [5], required that the observation would not change the
probability of an individual being the culprit. In other words, the protocol should
guarantee that the observation does not increase the chances of learning the identity
of the culprit. This is very similar to Dalenius' notion of privacy, and equally
unattainable in practice (at least, in the majority of real situations). Also in this
case, researchers in the area have started considering weaker notions based on
information theory, see for instance [3, 18, 22].

If we abstract from the kind of secrets and observables, anonymity and
information flow are similar problems: there is some information that we want
to keep secret, there is a system that produces some kind of observable
information depending on the secret one, and we want to prevent as much as possible
that an attacker may infer the secrets from the observables. It is therefore not
surprising that the foundations of the two fields have converged towards the
same information-theoretic approaches. The majority of these approaches are
based on the idea of representing the system (or protocol) as an
information-theoretic channel taking the secrets in input ($X$) and producing the
observables in output ($Y$). The entropy of $X$, $H(X)$, represents the converse of the
a priori vulnerability, i.e. the chance of the attacker to find out the secret.
Similarly, the conditional entropy of $X$ given $Y$, $H(X \mid Y)$, represents the
converse of the a posteriori vulnerability, i.e. the chance of the attacker to find
out the secret after having observed the output. The mutual information between
$X$ and $Y$, $I(X;Y) = H(X) - H(X \mid Y)$, represents the gain for the adversary provided
by the observation, and is taken as the definition of the information leakage of the
system. Sometimes we may want to abstract from the distribution of $X$, in which
case we can use the capacity of the channel, defined as the maximum of $I(X;Y)$
over all possible distributions on $X$. This represents the worst case for leakage.

The various approaches in the literature differ mainly in the notion of entropy.
This notion is related to the kind of attackers we want to model, and to how we
measure their success (see [15] for an illuminating discussion of this relation).
Shannon entropy [20], on which most of the approaches are based, represents an
adversary which tries to find out the secret $x$ by asking questions of the form
"does $x$ belong to the set $S$?". Shannon entropy is precisely the average number of
questions necessary to find out the exact value of $x$ with an optimal strategy
(i.e. an optimal choice of the $S$'s). The other most popular notion of entropy (in
this area) is Renyi's min entropy [19]. The corresponding notion of attack is a
single try of the form "is $x$ equal to $v$?". Renyi's min entropy is precisely the
negative logarithm of the probability of guessing the true value with the optimal
strategy, which consists, of course, in selecting the $v$ with the highest
probability. Approaches based on this notion include [21] and [2].

It is worth noting that, while the Renyi's min entropy of $X$, $H_\infty(X)$,
represents the a priori probability of success (of the single-try attack), the Renyi's
min conditional entropy of $X$ given $Y$, $H_\infty(X \mid Y)$, represents the a posteriori
probability of success^1. This a posteriori probability is the converse of the Bayes
risk [9], which has also been used as a measure of leakage [1,4].

1.3 Goal of the paper 

From a mathematical point of view, privacy presents many similarities with
information flow and anonymity. The private data of the entry constitute the 
secret, the answer to the query gives the observation, and the goal is to prevent 
as much as possible the inference of the secret from the observable. Differential 
privacy can be seen as a quantitative definition of the degree of leakage. The 
main goal of this paper is to explore the relation with the alternative definitions 
based on information theory, with the purpose of getting a better understanding 
of the notion of differential privacy, of the specific problems related to privacy, 
and of the models of attack used to formalize the notion of privacy, in relation 
to those used for anonymity and information flow. 

1.4 Contribution 

The contribution of this paper is as follows: 

- We show how the problem of privacy can be formulated in an
  information-theoretic setting. More precisely, we show how the answer function
  $\mathcal{K}$ can be associated to an information-theoretic channel.

- We prove that $\epsilon$-differential privacy implies a bound on the Shannon mutual
  information of the channel, and that this bound approaches 0 as $\epsilon$ approaches
  0. The same holds for Renyi min mutual information.

- We show that the converse of the above point does not hold, i.e. that
  Shannon and Renyi min mutual information (and also the corresponding
  capacities) can approach 0 while the $\epsilon$ parameter of differential privacy
  approaches infinity.

1.5 Plan of the paper 

The next section introduces some necessary background notions. Section 3 proposes
an information-theoretic view of database query systems. Section 4 shows
the main results of the paper, namely that differential privacy implies a bound
on Shannon and Renyi min mutual information, but not vice versa. Section 5
concludes and presents some ideas for future work.

The proofs of the results are in the appendix. The appendix will not be
included in the proceedings version (for reasons of space), but the proofs will be
made available online.

^1 We should mention that Renyi did not define the conditional version of the min
entropy, and that there have been various different proposals in the literature for
this notion. We use here the one proposed by Smith in [21].
2 Preliminaries

2.1 Differential privacy 

We assume a fixed finite universe U in which the entries of databases may range. 
The concept of differential privacy is tightly connected to the concept of adjacent 
(or neighbor) databases. 

Definition 1 ([13]). A pair of databases $(D', D'')$ is considered adjacent (or
neighbors) if one is a proper subset of the other and the larger database contains
just one additional entry.

Dwork's definition of differential privacy is the following:

Definition 2 ([11]). A randomized function $\mathcal{K}$ satisfies $\epsilon$-differential privacy if
for all pairs of adjacent databases $D'$ and $D''$, and all $S \subseteq Range(\mathcal{K})$,

$$\Pr[\mathcal{K}(D') \in S] \le e^{\epsilon} \times \Pr[\mathcal{K}(D'') \in S] \qquad (1)$$

2.2 Information theory and interpretation in terms of attacks

In the following, $X$, $Y$ denote two discrete random variables with carriers
$\mathcal{X} = \{x_1, \ldots, x_n\}$, $\mathcal{Y} = \{y_1, \ldots, y_m\}$, and probability distributions
$p_X(\cdot)$, $p_Y(\cdot)$, respectively. An information-theoretic channel is constituted by
an input $X$, an output $Y$, and the matrix of conditional probabilities
$p_{Y|X}(\cdot \mid \cdot)$, where $p_{Y|X}(y \mid x)$ represents the probability that $Y$ is $y$ given
that $X$ is $x$. We will use $X \wedge Y$ to represent the random variable with carrier
$\mathcal{X} \times \mathcal{Y}$ and joint probability distribution
$p_{X \wedge Y}(x, y) = p_X(x) \cdot p_{Y|X}(y \mid x)$. We shall omit the subscripts on the
probabilities when they are clear from the context.

2.3 Shannon entropy

The Shannon entropy of $X$ is defined as

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$$

The minimum value $H(X) = 0$ is obtained when $p(\cdot)$ is concentrated on a single
value (i.e. when $p(\cdot)$ is a Dirac delta). The maximum value $H(X) = \log |\mathcal{X}|$ is
obtained when $p(\cdot)$ is the uniform distribution. Usually the base of the logarithm
is set to be 2 and, correspondingly, the entropy is measured in bits.
The conditional entropy of $X$ given $Y$ is

$$H(X \mid Y) = \sum_{y \in \mathcal{Y}} p(y)\, H(X \mid Y = y)$$

where

$$H(X \mid Y = y) = -\sum_{x \in \mathcal{X}} p(x \mid y) \log p(x \mid y)$$
We can prove that $0 \le H(X \mid Y) \le H(X)$. The minimum value, 0, is obtained
when $X$ is completely determined by $Y$. The maximum value $H(X)$ is obtained
when $Y$ reveals no information about $X$, i.e. when $X$ and $Y$ are independent.
The mutual information between $X$ and $Y$ is defined as

$$I(X;Y) = H(X) - H(X \mid Y) \qquad (2)$$

and it measures the amount of information about $X$ that we gain by observing
$Y$. It can be shown that $I(X;Y) = I(Y;X)$ and $0 \le I(X;Y) \le H(X)$.

Shannon capacity is defined as the maximum mutual information over all
possible input distributions:

$$C = \max_{p_X(\cdot)} I(X;Y)$$
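The quantities of this section can be computed directly for a small channel. The following is a minimal sketch, assuming that the prior is given as a list and the channel as a row-stochastic matrix `channel[x][y]` standing for $p(y \mid x)$; the function names and the two toy channels are illustrative choices, not part of the paper.

```python
import math

def shannon_entropy(p):
    """H(X) = -sum_x p(x) log2 p(x), with 0 log 0 taken as 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def conditional_entropy(prior, channel):
    """H(X|Y) = sum_y p(y) H(X | Y=y), where channel[x][y] = p(y|x)."""
    n, m = len(prior), len(channel[0])
    total = 0.0
    for y in range(m):
        py = sum(prior[x] * channel[x][y] for x in range(n))
        if py == 0:
            continue  # outputs that never occur contribute nothing
        posterior = [prior[x] * channel[x][y] / py for x in range(n)]
        total += py * shannon_entropy(posterior)
    return total

def mutual_information(prior, channel):
    """I(X;Y) = H(X) - H(X|Y), in bits."""
    return shannon_entropy(prior) - conditional_entropy(prior, channel)

uniform = [0.5, 0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]   # Y reveals X completely
constant = [[0.5, 0.5], [0.5, 0.5]]   # Y is independent of X

print(mutual_information(uniform, identity))
print(mutual_information(uniform, constant))
```

The two toy channels illustrate the extreme cases mentioned above: when $Y$ determines $X$ the leakage equals $H(X)$, and when $X$ and $Y$ are independent the leakage is 0.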



2.4 Renyi min-entropy 

In [19], Renyi introduced a one-parameter family of entropy measures, intended
as a generalization of Shannon entropy. The Renyi entropy of order $\alpha$ ($\alpha > 0$,
$\alpha \ne 1$) of a random variable $X$ is defined as

$$H_\alpha(X) = \frac{1}{1 - \alpha} \log \sum_{x \in \mathcal{X}} p(x)^{\alpha}$$

We are particularly interested in the limit of $H_\alpha$ as $\alpha$ approaches $\infty$. This is
called min-entropy. It can be proven that

$$H_\infty(X) = \lim_{\alpha \to \infty} H_\alpha(X) = -\log \max_{x \in \mathcal{X}} p(x)$$

Renyi also defined the $\alpha$-generalization of other information-theoretic
notions, like the Kullback-Leibler divergence. However, he did not define the
$\alpha$-generalization of the conditional entropy, and there is no agreement on what
it should be. For the case $\alpha = \infty$, we adopt here the definition of conditional
entropy proposed by Smith in [21]:

$$H_\infty(X \mid Y) = -\log \sum_{y \in \mathcal{Y}} p(y) \max_{x \in \mathcal{X}} p(x \mid y) \qquad (3)$$

Analogously to (2), we can define the mutual information $I_\infty$ as
$H_\infty(X) - H_\infty(X \mid Y)$, and the capacity $C_\infty$ as $\max_{p_X(\cdot)} I_\infty(X;Y)$. It has been
proven in [2] that $C_\infty$ is obtained at the uniform distribution, and that it is equal
to the logarithm of the sum of the maxima of each column in the channel matrix:

$$C_\infty = \log \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} p(y \mid x).$$
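The min-entropy notions can be evaluated in the same way as the Shannon ones. The sketch below assumes a channel stored as `channel[x][y]` for $p(y \mid x)$ and an arbitrary 3x3 example matrix chosen only for illustration. It uses the fact that, with definition (3) and a uniform prior, the min mutual information works out to the logarithm of the sum of the column maxima.

```python
import math

def min_entropy(prior):
    """H_inf(X) = -log2 max_x p(x)."""
    return -math.log2(max(prior))

def conditional_min_entropy(prior, channel):
    """H_inf(X|Y) = -log2 sum_y max_x p(x) p(y|x), i.e. definition (3)."""
    n, m = len(prior), len(channel[0])
    success = sum(max(prior[x] * channel[x][y] for x in range(n))
                  for y in range(m))
    return -math.log2(success)

def min_mutual_information(prior, channel):
    """I_inf(X;Y) = H_inf(X) - H_inf(X|Y), in bits."""
    return min_entropy(prior) - conditional_min_entropy(prior, channel)

def log_sum_of_column_maxima(channel):
    """log2 of the sum over outputs y of max_x p(y|x)."""
    m = len(channel[0])
    return math.log2(sum(max(row[y] for row in channel) for y in range(m)))

# Hypothetical channel, used only to exercise the definitions.
channel = [[0.7, 0.2, 0.1],
           [0.1, 0.6, 0.3],
           [0.2, 0.2, 0.6]]
uniform = [1 / 3, 1 / 3, 1 / 3]
skewed = [0.8, 0.1, 0.1]

print(min_mutual_information(uniform, channel), log_sum_of_column_maxima(channel))
print(min_mutual_information(skewed, channel))
```

With the skewed prior the attacker's best guess is the same before and after the observation, so the min mutual information is lower than at the uniform prior.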
3 An information theoretic model of privacy

In this section we show how to represent a database query system (of the kind
considered in differential privacy) in terms of an information-theoretic channel.

According to [11] and [13], differential privacy can be implemented by adding
some appropriately chosen random noise to the answer $x = f(D)$, where $f$ is the
query function and $D$ is the database. The function can operate on the entire
database at once, and even though the query may be composed of a chain of
sub-queries, we assume that subsequent sub-queries depend only on the true answer
to previous sub-queries. Under this constraint, no matter how complex the query
is, it is still a function $f$ of the database $D$. The scenario where subsequent
sub-queries can depend on the reported answer to previous queries corresponds to
adaptive adversaries [11], and is not considered in this paper.

After the true answer $x$ to the query is obtained from $D$, some noise is
introduced in order to produce a reported answer $y$. The reported answer can be seen
as a random variable $Y$ dependent on the random variable $X$ corresponding to
the real answer, and the two random variables are related by a conditional
probability distribution $p_{Y|X}(\cdot \mid \cdot)$. The conditional probabilities
$p_{Y|X}(y \mid x)$ constitute the matrix of an information-theoretic channel from $X$ to $Y$.

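Figure 1 shows the implementation of a differential privacy scheme.

[Figure 1: the database $D$ is queried, yielding the value of $f(D \cup r)$; this value
enters the randomization channel $p_{Y|X}(\cdot \mid \cdot)$ with outputs $y_1, \ldots, y_m$, which
produces the reported answer, i.e. the randomized value $\mathcal{K}(f(D \cup r))$.]

Fig. 1. The channel corresponding to a differential privacy scheme.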



In [11] it has been proved that a way to define the values of $p_{Y|X}(y \mid x)$ so as
to ensure $\epsilon$-differential privacy is by using the Laplace distribution:

$$P(Y = y \mid X = x, \Delta f/\epsilon) = \frac{\epsilon}{2 \Delta f}\, e^{-|y - x|\,\epsilon/\Delta f} \qquad (4)$$

where $\Delta f$ is the L1-sensitivity of $f$, defined as^2

$$\Delta f = \max_{D', D''\ \text{adjacent}} |f(D') - f(D'')|.$$

^2 We give here the definition for the case in which the range of $f$ is $\mathbb{R}$. In the more
general case in which the range is multidimensional we should replace
$|f(D') - f(D'')|$ by the 1-norm of the vector $f(D') - f(D'')$.
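The Laplace construction of equation (4) can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the function names are hypothetical, and the sampler relies on the standard fact that the difference of two independent exponential variables with the same mean is Laplace distributed. The helper `worst_log_ratio` checks numerically, on a finite grid of outputs, that the density ratio between two true answers at distance at most $\Delta f$ stays within $e^{\epsilon}$.

```python
import math
import random

def laplace_density(y, x, sensitivity, epsilon):
    """Density of equation (4): Laplace centred at x with scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return math.exp(-abs(y - x) / scale) / (2 * scale)

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Return the true answer plus Laplace noise of scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_answer + noise

def worst_log_ratio(x1, x2, sensitivity, epsilon, ys):
    """Largest |ln(p(y|x1) / p(y|x2))| over the grid of outputs ys."""
    return max(abs(math.log(laplace_density(y, x1, sensitivity, epsilon)
                            / laplace_density(y, x2, sensitivity, epsilon)))
               for y in ys)

# Hypothetical counting query: adding one entry changes the count by at most 1.
epsilon, sensitivity = 0.5, 1.0
grid = [i / 10 for i in range(-500, 501)]
print(worst_log_ratio(10.0, 11.0, sensitivity, epsilon, grid))
print(laplace_mechanism(10.0, sensitivity, epsilon, random.Random(0)))
```

The grid check is only a sanity test on finitely many points; the bound itself follows from the triangle inequality applied to the exponent of (4).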
4 Relation between differential privacy and mutual
information

In this section we investigate the relation between differential privacy and information- 
theoretic notions. We start by considering an equivalent definition of differential 
privacy, easier to handle for our purposes. 

4.1 Testing single elements 

Definition 2 considers tests which check whether the result of $\mathcal{K}(D)$ belongs to a
certain set or not. We prefer to simplify this definition by considering only tests
over single elements:

Definition 3. A randomized function $\mathcal{K}$ gives $\delta$-differential privacy if for all
pairs of adjacent databases $D'$ and $D''$, and all $k \in Range(\mathcal{K})$,

$$\Pr[\mathcal{K}(D') = k] \le e^{\delta} \times \Pr[\mathcal{K}(D'') = k] \qquad (5)$$

The following result shows that our definition of differential privacy is
equivalent to the classical one.

Theorem 1. A function $\mathcal{K}$ gives $\epsilon$-differential privacy iff it gives $\delta$-differential
privacy, with $\epsilon = \delta$.
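The equivalence stated in Theorem 1 can be observed on a small discrete range. The sketch below assumes two hypothetical output distributions `p` and `q`, standing for the answers on two adjacent databases, with strictly positive probabilities. It computes the smallest parameter satisfying the single-element test and the smallest parameter satisfying the test over all non-empty subsets, both taken in the two directions.

```python
import math
from itertools import combinations

def pointwise_parameter(p, q):
    """Smallest delta such that p[k] <= e^delta * q[k] and vice versa, for all k."""
    return max(abs(math.log(pk / qk)) for pk, qk in zip(p, q))

def set_parameter(p, q):
    """Smallest epsilon such that P(S) <= e^epsilon * Q(S) and vice versa, for all S."""
    indices = range(len(p))
    worst = 0.0
    for size in range(1, len(p) + 1):
        for subset in combinations(indices, size):
            p_s = sum(p[k] for k in subset)
            q_s = sum(q[k] for k in subset)
            worst = max(worst, abs(math.log(p_s / q_s)))
    return worst

# Hypothetical answer distributions on two adjacent databases.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(pointwise_parameter(p, q), set_parameter(p, q))
```

The subset enumeration is exponential in the size of the range, so this check is only meant for toy examples; the worst case is always attained on a singleton, which is the content of the theorem.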

4.2 Databases with the same number of entries and differing in at
most one entry

Consider two databases $D'$ and $D''$ that have the same number of entries and
differ in at most one entry as in Figure 2. Let $D$ be the common part shared
by both databases, and let $r'$ and $r''$ be the rows in which they differ, namely
$D' = D \cup \{r'\}$ and $D'' = D \cup \{r''\}$.

[Figure 2: the databases $D'$ and $D''$, each consisting of the common part $D$ plus
one row, $r'$ and $r''$ respectively.]

Fig. 2. Two databases differing in exactly one entry

We prove that $\delta$-differential privacy imposes also a bound on the comparison
between databases with the same number of entries, and which differ in the
values of only one entry.

Lemma 1. Let $\mathcal{K}$ be a function that gives $\delta$-differential privacy for all pairs of
adjacent databases. Given two databases $D'$ and $D''$ that have the same number
of entries and differ in the value of at most one entry, then:

$$\Pr[\mathcal{K}(D') = k] \le e^{2\delta} \times \Pr[\mathcal{K}(D'') = k]$$
4.3 Shannon mutual information

We prove now that $\delta$-differential privacy imposes a bound on Shannon mutual
information, and that this bound approaches 0 as the parameter $\delta$ approaches
0.

Theorem 2. If a randomized function $\mathcal{K}$ gives $\delta$-differential privacy according
to Definition 3, then for every result $x^*$ of the function $f$ the Shannon mutual
information between the true answers $X$ (i.e. the results of $f$) and the reported
answers $Y$ (i.e. the results of $\mathcal{K}$) is bounded by:

$$I(X;Y) \le (e^{2\delta} + e^{-2\delta})\, 2\delta \log(e) + (e^{2\delta} - e^{-2\delta}) \sum_y p(y \mid x^*) \log(p(y \mid x^*))$$

It is easy to see that the expression which bounds $I$ from above,
$(e^{2\delta} + e^{-2\delta})\, 2\delta \log(e) + (e^{2\delta} - e^{-2\delta}) \sum_y p(y \mid x^*) \log(p(y \mid x^*))$,
converges to 0 when $\delta$ approaches 0.

The converse of Theorem 2 does not hold. One reason is that mutual
information is sensitive to the values of the input distribution, while differential
privacy is not. The next example illustrates this point.

Example 1. Let $n$ be the number of elements of the universe, and $m$ the
cardinality of the set of possible answers of $f$. Assume that $p(x_1) = \alpha$ and
$p(x_i) = \frac{1 - \alpha}{n - 1}$ for $2 \le i \le n$. Let $p(y_1 \mid x_1) = \beta$,
$p(y_j \mid x_1) = \frac{1 - \beta}{m - 1}$ for $2 \le j \le m$, and $p(y_j \mid x_i) = \frac{1}{m}$ otherwise.
This channel is represented in Table 1(a). It is easy to see that the Shannon
mutual information approaches 0 as $\alpha$ approaches 0, independently of the value
of $\beta$. Differential privacy, however, depends only on the value of $\beta$; more
precisely, the parameter of differential privacy is
$\max\{\log_e \frac{1}{m\beta}, \log_e m\beta, \log_e \frac{m - 1}{m(1 - \beta)}, \log_e \frac{m(1 - \beta)}{m - 1}\}$,
and it is easy to see that this parameter is unbounded and goes to infinity as $\beta$
approaches 0.
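Example 1 can be explored numerically. The sketch below rests on an assumed reading of the channel, forced by normalization: the remaining prior mass $1 - \alpha$ is spread uniformly over $x_2, \ldots, x_n$, the row of $x_1$ is $(\beta, \frac{1-\beta}{m-1}, \ldots, \frac{1-\beta}{m-1})$, and all other rows are uniform with entries $\frac{1}{m}$. The function names are illustrative.

```python
import math

def example1(n, m, alpha, beta):
    """Prior and channel of Example 1 under the assumed (normalized) reading."""
    prior = [alpha] + [(1 - alpha) / (n - 1)] * (n - 1)
    first_row = [beta] + [(1 - beta) / (m - 1)] * (m - 1)
    channel = [first_row] + [[1 / m] * m for _ in range(n - 1)]
    return prior, channel

def mutual_information(prior, channel):
    """I(X;Y) = sum_{x,y} p(x) p(y|x) log2(p(y|x) / p(y)), in bits."""
    m = len(channel[0])
    p_y = [sum(px * row[y] for px, row in zip(prior, channel)) for y in range(m)]
    return sum(px * row[y] * math.log2(row[y] / p_y[y])
               for px, row in zip(prior, channel) for y in range(m)
               if px > 0 and row[y] > 0)

def privacy_parameter(channel):
    """Largest ln(p(y|x) / p(y|x')) over all outputs y and pairs of inputs x, x'."""
    m = len(channel[0])
    return max(math.log(max(row[y] for row in channel) / min(row[y] for row in channel))
               for y in range(m))

prior, channel = example1(n=4, m=3, alpha=1e-6, beta=1e-9)
print(mutual_information(prior, channel))   # close to 0 for small alpha
print(privacy_parameter(channel))           # grows without bound as beta -> 0
```

Keeping $\beta$ fixed and shrinking $\alpha$ drives the first number to 0, while keeping $\alpha$ fixed and shrinking $\beta$ drives the second number to infinity.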

The reasoning in the counterexample above is no longer valid if we
consider capacity instead of mutual information. However, there is another reason
why the converse of Theorem 2 does not hold, and this remains the case also if
we consider capacity. The situation is illustrated by the following example.

Example 2. Let $n$ be the number of elements of the universe, and $m$ the
cardinality of the set of possible answers of $f$. Assume that $p(y_i \mid x_i) = \beta$ and
$p(y_i \mid x_j) = \frac{1 - \beta}{m - 1}$ for $i \ne j$. This channel is represented in Table 1(b).
It is easy to see that the Shannon capacity is
$C = \log m - (1 - \beta) \log(m - 1) + \beta \log \beta + (1 - \beta) \log(1 - \beta)$,
and that $C$ approaches 0 as $\beta$ approaches 0 and $m$ becomes large. Differential
privacy, however, goes in the other direction when $\beta$ approaches 0, and it is not
very sensitive to the value of $m$. More precisely, the parameter of differential
privacy is $\max\{\log_e \frac{1 - \beta}{(m - 1)\beta}, \log_e \frac{(m - 1)\beta}{1 - \beta}\}$,
and it is easy to see that this parameter is unbounded and goes to infinity as $\beta$
approaches 0, independently of the value of $m$.
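The behaviour described in Example 2 can be checked with a few lines of code. The sketch below assumes a square channel ($n = m$) with $\beta$ on the diagonal and $\frac{1-\beta}{m-1}$ elsewhere, which is the reading forced by each row summing to 1; the function names are illustrative, and the closed form is the capacity expression given in the example, evaluated in bits.

```python
import math

def example2_channel(m, beta):
    """Square channel with beta on the diagonal and (1 - beta)/(m - 1) elsewhere."""
    off_diagonal = (1 - beta) / (m - 1)
    return [[beta if i == j else off_diagonal for j in range(m)] for i in range(m)]

def capacity_closed_form(m, beta):
    """C = log m - (1-beta) log(m-1) + beta log beta + (1-beta) log(1-beta), base 2."""
    return (math.log2(m) - (1 - beta) * math.log2(m - 1)
            + beta * math.log2(beta) + (1 - beta) * math.log2(1 - beta))

def mutual_information(prior, channel):
    """I(X;Y) = sum_{x,y} p(x) p(y|x) log2(p(y|x) / p(y)), in bits."""
    m = len(channel[0])
    p_y = [sum(px * row[y] for px, row in zip(prior, channel)) for y in range(m)]
    return sum(px * row[y] * math.log2(row[y] / p_y[y])
               for px, row in zip(prior, channel) for y in range(m)
               if px > 0 and row[y] > 0)

def privacy_parameter(channel):
    """Largest ln(p(y|x) / p(y|x')) over all outputs y and pairs of inputs x, x'."""
    m = len(channel[0])
    return max(math.log(max(row[y] for row in channel) / min(row[y] for row in channel))
               for y in range(m))

# For this symmetric channel the uniform prior attains the capacity.
small = example2_channel(4, 0.1)
print(mutual_information([0.25] * 4, small), capacity_closed_form(4, 0.1))

# Small beta and large m: tiny capacity, large privacy parameter.
print(capacity_closed_form(200, 1e-6), privacy_parameter(example2_channel(200, 1e-6)))
```

The last line shows the two measures disagreeing: the capacity is a small fraction of a bit while the privacy parameter is already large, and it keeps growing as $\beta$ decreases.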
Table 1. The channels of (a) Example 1 and (b) Example 2



4.4 Renyi min mutual information 

We show now that a result analogous to that of Section 4.3 holds also in the
case of Renyi min entropy.

Theorem 3. If a randomized function $\mathcal{K}$ gives $\delta$-differential privacy according
to Definition 3, then the Renyi min mutual information between the true answer
of the function $X$ and the reported answer $Y$ is bounded by

$$I_\infty(X;Y) \le 2\delta \log e.$$

The converse of Theorem 3 does not hold, not even if we consider capacity
instead of mutual information. It is easy to prove, in fact, that Examples 1
and 2 lead to counterexamples also in the case of Renyi min mutual information
and capacity.
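The bound of Theorem 3 can be tested on randomly generated channels. The sketch below makes one substitution that should be stated plainly: the proof only uses the inequality $p(y \mid x) \le e^{2\delta} p(y \mid x^*)$ between rows of the channel, so here the measured worst-case logarithmic ratio between rows plays the role of $2\delta$. All names, sizes and the random generation scheme are illustrative assumptions.

```python
import math
import random

def random_distribution(size, rng):
    """A strictly positive probability vector drawn from normalized uniform weights."""
    weights = [rng.uniform(0.1, 1.0) for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def min_mutual_information(prior, channel):
    """I_inf(X;Y) = log2( sum_y max_x p(x) p(y|x) / max_x p(x) ), in bits."""
    m = len(channel[0])
    posterior_success = sum(max(px * row[y] for px, row in zip(prior, channel))
                            for y in range(m))
    return math.log2(posterior_success / max(prior))

def worst_log_ratio(channel):
    """Largest ln(p(y|x) / p(y|x')) over outputs y and inputs x, x' (stands for 2*delta)."""
    m = len(channel[0])
    return max(math.log(max(row[y] for row in channel) / min(row[y] for row in channel))
               for y in range(m))

def bound_holds(n, m, seed):
    """Check I_inf(X;Y) <= (worst log ratio) * log2(e) on one random instance."""
    rng = random.Random(seed)
    prior = random_distribution(n, rng)
    channel = [random_distribution(m, rng) for _ in range(n)]
    bound = worst_log_ratio(channel) * math.log2(math.e)
    return min_mutual_information(prior, channel) <= bound + 1e-12

print(all(bound_holds(4, 5, seed) for seed in range(20)))
```

The factor `math.log2(math.e)` converts the bound to bits, matching the $\log e$ term in the statement of the theorem when logarithms are taken in base 2.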



5 Conclusion and future work 

In this paper we have shown that the problem of privacy in statistical databases
can be formulated in information-theoretic terms, in a way analogous to what
has been done for information flow and anonymity: the database query system
can be seen as a noisy channel, in the information-theoretic sense. Then we have
considered Dwork's notion of differential privacy, and we have shown that it is
strictly stronger than requiring the channel to have low capacity, both for the
cases of Shannon and Renyi min entropy. It is natural to consider, then, whether
a weaker notion would give enough privacy guarantees. As future work, we
intend to investigate this question.

We first need to understand, of course, which constraints could
be relaxed in the notion of differential privacy. To this aim, Example 2 is quite
interesting: whenever we get an answer $y$, there are $n - 1$ possible inputs (entries)
which are equally likely to have generated that answer, and one input $x$ that
is much less likely than the others ($p(x \mid y) = \alpha$, where $\alpha$ is a very small value).
The existence of the latter seems quite harmless, yet it is exactly that entry that
causes differential privacy to fail (in the sense that its parameter is unbounded).
The notion of Renyi min capacity seems a plausible candidate for the notion of
privacy: its relation with the Bayes risk ensures that a bound on $C_\infty$ can be seen
as a bound on the probability of guessing the right value of $x$ (given the
observable). In some scenarios, this may be exactly what we want.
Appendix

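Theorem 4 (Theorem 1 in the paper). A function $\mathcal{K}$ gives $\epsilon$-differential
privacy iff it gives $\delta$-differential privacy, with $\epsilon = \delta$.

Proof.

($\Rightarrow$) Let $k \in Range(\mathcal{K})$. Then for all pairs of adjacent databases $D'$, $D''$ we have

$$\begin{aligned}
\Pr[\mathcal{K}(D') = k] &= \Pr[\mathcal{K}(D') \in \{k\}] && \text{(taking $S$ to be a singleton set)} \\
&\le e^{\epsilon} \Pr[\mathcal{K}(D'') \in \{k\}] && \text{(by Definition 2)} \\
&= e^{\epsilon} \Pr[\mathcal{K}(D'') = k]
\end{aligned}$$

($\Leftarrow$) Let $S \subseteq Range(\mathcal{K})$. Then

$$\begin{aligned}
\Pr[\mathcal{K}(D') \in S] &= \sum_{k \in S} \Pr[\mathcal{K}(D') = k] && \text{(by union of elements)} \\
&\le \sum_{k \in S} e^{\delta} \Pr[\mathcal{K}(D'') = k] && \text{(by Definition 3)} \\
&= e^{\delta} \sum_{k \in S} \Pr[\mathcal{K}(D'') = k] && \text{(by distributivity)} \\
&= e^{\delta} \Pr[\mathcal{K}(D'') \in S] && \text{(by union of elements)}
\end{aligned}$$

□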



Lemma 2 (Lemma 1 in the paper). Let $\mathcal{K}$ be a function that gives
$\delta$-differential privacy for all pairs of adjacent databases. Given two databases $D'$
and $D''$ that have the same number of entries and differ in the value of at most
one entry, then:

$$\Pr[\mathcal{K}(D') = k] \le e^{2\delta} \times \Pr[\mathcal{K}(D'') = k]$$

Proof. Let us call $D$ the common part that $D'$ and $D''$ share, and let us call $r'$
and $r''$ the entries in which they differ, in such a way that $D' = D \cup \{r'\}$ and
$D'' = D \cup \{r''\}$. Then

$$\begin{aligned}
\Pr[\mathcal{K}(D \cup \{r'\}) = k] &\le e^{\delta} \times \Pr[\mathcal{K}(D) = k] && \text{(by Definition 3)} \\
&\le e^{\delta} \times e^{\delta} \times \Pr[\mathcal{K}(D \cup \{r''\}) = k] && \text{(by Definition 3)} \\
&= e^{2\delta} \times \Pr[\mathcal{K}(D'') = k]
\end{aligned}$$

□

Theorem 5 (Theorem 2 in the paper). If a randomized function $\mathcal{K}$ gives
$\delta$-differential privacy according to Definition 3, then for every result $x^*$ of the
function $f$ the Shannon mutual information between the true answers $X$ (i.e.
the results of $f$) and the reported answers $Y$ (i.e. the results of $\mathcal{K}$) is bounded
by:

$$I(X;Y) \le (e^{2\delta} + e^{-2\delta})\, 2\delta \log(e) + (e^{2\delta} - e^{-2\delta}) \sum_y p(y \mid x^*) \log(p(y \mid x^*))$$
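Proof. Let us calculate the Shannon mutual information using the formula
$I(X;Y) = H(Y) - H(Y \mid X)$.

$$\begin{aligned}
H(Y) &= -\sum_y p(y) \log p(y) && \text{(by definition)} \\
&= -\sum_y \Big(\sum_x p(x, y)\Big) \log \Big(\sum_x p(x, y)\Big) && \text{(by probability laws)} \\
&= -\sum_y \Big(\sum_x p(x)\, p(y \mid x)\Big) \log \Big(\sum_x p(x)\, p(y \mid x)\Big) && \text{(by probability laws)} \\
&\le -\sum_y \Big(\sum_x p(x)\, e^{-2\delta} p(y \mid x^*)\Big) \log \Big(\sum_x p(x)\, e^{-2\delta} p(y \mid x^*)\Big) && \text{(by Definition 3 and Lemma 1)} \\
&= -\sum_y e^{-2\delta} p(y \mid x^*) \Big(\sum_x p(x)\Big) \log \Big(e^{-2\delta} p(y \mid x^*) \sum_x p(x)\Big) \\
&= -\sum_y e^{-2\delta} p(y \mid x^*) \log\big(e^{-2\delta} p(y \mid x^*)\big) && \text{(by probability laws)} \\
&= -\sum_y \big(e^{-2\delta} p(y \mid x^*) \log e^{-2\delta}\big) - \sum_y \big(e^{-2\delta} p(y \mid x^*) \log p(y \mid x^*)\big) && \text{(by distributivity)} \\
&= -e^{-2\delta} \log e^{-2\delta} \Big(\sum_y p(y \mid x^*)\Big) - \sum_y \big(e^{-2\delta} p(y \mid x^*) \log p(y \mid x^*)\big) \\
&= 2\delta e^{-2\delta} \log e - e^{-2\delta} \sum_y p(y \mid x^*) \log p(y \mid x^*) && \text{(by probability laws)}
\end{aligned} \qquad (6)$$

$$\begin{aligned}
H(Y \mid X) &= -\sum_x p(x) \sum_y p(y \mid x) \log p(y \mid x) && \text{(by definition)} \\
&\ge -\sum_x p(x) \sum_y e^{2\delta} p(y \mid x^*) \log\big(e^{2\delta} p(y \mid x^*)\big) && \text{(by Definition 3 and Lemma 1)} \\
&= -\Big(\sum_y e^{2\delta} p(y \mid x^*) \log\big(e^{2\delta} p(y \mid x^*)\big)\Big) \sum_x p(x) && \text{(by distributivity)} \\
&= -\sum_y e^{2\delta} p(y \mid x^*) \log\big(e^{2\delta} p(y \mid x^*)\big) && \text{(by probability laws)} \\
&= -\sum_y \big(e^{2\delta} p(y \mid x^*) \log(e^{2\delta})\big) - \sum_y \big(e^{2\delta} p(y \mid x^*) \log p(y \mid x^*)\big) \\
&= -e^{2\delta} \log e^{2\delta} \Big(\sum_y p(y \mid x^*)\Big) - e^{2\delta} \sum_y p(y \mid x^*) \log p(y \mid x^*) \\
&= -2\delta e^{2\delta} \log e - e^{2\delta} \sum_y p(y \mid x^*) \log p(y \mid x^*) && \text{(by probability laws)}
\end{aligned} \qquad (7)$$

$$\begin{aligned}
I(X;Y) &= H(Y) - H(Y \mid X) && \text{(by definition)} \\
&\le 2\delta e^{-2\delta} \log e - e^{-2\delta} \sum_y p(y \mid x^*) \log p(y \mid x^*) \\
&\quad + 2\delta e^{2\delta} \log e + e^{2\delta} \sum_y p(y \mid x^*) \log p(y \mid x^*) && \text{(by Equations 6 and 7)} \\
&= (e^{2\delta} + e^{-2\delta})\, 2\delta \log(e) + (e^{2\delta} - e^{-2\delta}) \sum_y p(y \mid x^*) \log(p(y \mid x^*)) && \text{(by distributivity)}
\end{aligned}$$

□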

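Theorem 6 (Theorem 3 in the paper). If a randomized function $\mathcal{K}$ gives
$\delta$-differential privacy according to Definition 3, then the Renyi min mutual
information between the true answer of the function $X$ and the reported answer $Y$
is bounded by:

$$I_\infty(X;Y) \le 2\delta \log e.$$

Proof. Let us calculate the Renyi mutual information using the formula
$I_\infty(X;Y) = H_\infty(X) - H_\infty(X \mid Y)$.

$$H_\infty(X) = -\log \max_x p(x) \quad \text{(by definition)} \qquad (8)$$

$$\begin{aligned}
H_\infty(X \mid Y) &= -\log \sum_y p(y) \max_x p(x \mid y) && \text{(by definition)} \\
&= -\log \sum_y \max_x p(y)\, p(x \mid y) \\
&= -\log \sum_y \max_x p(x)\, p(y \mid x) && \text{(by probability laws)} \\
&\ge -\log \sum_y \max_x p(x)\, e^{2\delta} p(y \mid x^*) && \text{(by Definition 3 and Lemma 1)} \\
&= -\log \sum_y e^{2\delta} p(y \mid x^*) \max_x p(x) \\
&= -\log \Big(e^{2\delta} \max_x p(x) \sum_y p(y \mid x^*)\Big) \\
&= -\log \Big(e^{2\delta} \max_x p(x)\Big) && \text{(by probability laws)} \\
&= -2\delta \log e - \log \max_x p(x)
\end{aligned} \qquad (9)$$

$$\begin{aligned}
I_\infty(X;Y) &= H_\infty(X) - H_\infty(X \mid Y) && \text{(by definition)} \\
&\le -\log \max_x p(x) + 2\delta \log e + \log \max_x p(x) && \text{(by Equations 8 and 9)} \\
&= 2\delta \log e
\end{aligned}$$

□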